Speech Recognition Performance of Adults: A Proposal for a Battery for Telugu
Abstract
Speech audiometry is an essential component of the audiological test battery because it provides information about a listener's sensitivity to speech stimuli and understanding of speech at supra-threshold levels. Investigators have developed many kinds of speech audiometry materials in English and in other languages, and several attempts have been made to develop and standardize such materials in Indian languages. For Telugu, a South Indian Dravidian language, no material is available for measuring open-set speech recognition scores in adults. Telugu is the mother tongue of the majority of people in Andhra Pradesh, a southern state of India that is divided into three regions. Because of dialectal variation, some of the most familiar and frequently used words in one region may be unfamiliar to people from the other regions. The purpose of this study was to develop speech material in Telugu that can be used to assess the speech recognition performance of individuals from all three regions. Four lists of bisyllabic Telugu words were developed, and the equivalence of difficulty between the lists was evaluated for three groups of subjects with normal hearing (drawn from the three regions, age range 18-25 years). The performance-intensity (PI) function for each list was then measured for the three groups. The results revealed no significant difference (p > 0.05) between the scores obtained by the three groups for each list or between the four lists within each group. The four word lists were therefore found to be equally difficult for all groups.
The PI function was semi-linear, and the linear portion of the curve had an average slope of 4.64%, 4.62%, 4.52% and 4.54% increase in word recognition score per dB for lists 1, 2, 3 and 4, respectively, in accordance with the findings of earlier studies. The four lists were thus found to have sufficient reliability and validity for assessing speech recognition performance.
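As an illustration of how a slope in percent per dB might be obtained from a PI function, the sketch below fits a least-squares line to the points in the linear portion of the curve. The abstract does not say how the linear portion was delimited or how the slope was fitted, so the 20-80% score window, the function name `linear_slope`, and the data values are all assumptions made for this example and are not taken from the study.

```python
def linear_slope(levels_db, scores_pct, lo=20.0, hi=80.0):
    """Least-squares slope (% per dB) over points whose score lies in [lo, hi].

    The [lo, hi] window is an assumed definition of the "linear portion".
    """
    pts = [(x, y) for x, y in zip(levels_db, scores_pct) if lo <= y <= hi]
    if len(pts) < 2:
        raise ValueError("need at least two points in the linear portion")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("presentation levels must differ")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


# Hypothetical presentation levels (dB) and word recognition scores (%),
# invented for illustration only.
levels = [0, 5, 10, 15, 20, 25, 30]
scores = [2, 12, 35, 58, 80, 94, 99]

slope = linear_slope(levels, scores)
print(f"slope of linear portion: {slope:.2f} % per dB")
```

Restricting the fit to a mid-range window keeps the floor and ceiling of the curve from flattening the estimate. A different window would give a different slope.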
Related articles
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition
The Hidden Markov Model (HMM) is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete density or continuous density modeling. The performance (in correct word recognition rate) of the continuous density HMM is higher than that of the discrete density HMM, but its computational complexity is very ...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN which has a bottleneck layer among its fully connected layers. The bottleneck fea...